An Elastic-Phrase Model for Statistical Machine Translation

Authors

  • Nicola Cancedda
  • Marc Dymetman
  • Cyril Goutte
Abstract

We present some on-going research on phrase-based Statistical Machine Translation using flexible phrases that may contain gaps of variable lengths. This allows us to naturally handle various linguistic phenomena such as negations or separable particles. We integrate this within the standard Maximum Entropy model using some dedicated feature functions, and describe a beam-search stack decoder that handles these noncontiguous, elastic phrases. Preliminary experimental results show that the translation performance compares favourably with phrase-based MT using fixed gap size. We expect that future results may allow us to leverage the added flexibility of elastic chunks to further increase translation performance.


Similar resources

A Phrase-Boundary Translation Model Using Shallow Syntactic Labels

The phrase-boundary model for statistical machine translation labels the rules with classes of boundary words on the target-side phrases of the training corpus. In this paper, we extend the phrase-boundary model using shallow syntactic labels including POS tags and chunk labels. With the priority of chunk labels, the proposed model names non-terminals with shallow syntactic labels on the boundaries of ...


A Generalized Reordering Model for Phrase-Based Statistical Machine Translation

Phrase-based translation models are widely studied in statistical machine translation (SMT). However, the existing phrase-based translation models either cannot deal with non-contiguous phrases or reorder phrases only by rules, without an effective reordering model. In this paper, we propose a generalized reordering model (GREM) for phrase-based statistical machine translation, which is not...


Phrase-Based Statistical Machine Translation: A Level of Detail Approach

The merit of phrase-based statistical machine translation is often reduced by the complexity of constructing it. In this paper, we address some issues in phrase-based statistical machine translation, namely: the size of the phrase translation table, the use of the underlying translation model probability, and the length of the phrase unit. We present the Level-Of-Detail (LOD) approach, an agglomerative app...


NUT-NTT statistical machine translation system for IWSLT 2005

In this paper, we present a novel distortion model for phrase-based statistical machine translation. Unlike previous phrase distortion models, whose role is simply to penalize nonmonotonic alignments [1, 2], the new model assigns a probability to the relative position of two source-language phrases aligned to two adjacent target-language phrases. The phrase translation probabilities an...


An Iteratively-Trained Segmentation-Free Phrase Translation Model for Statistical Machine Translation

Attempts to estimate phrase translation probabilities for statistical machine translation using iteratively-trained models have repeatedly failed to produce translations as good as those obtained by estimating phrase translation probabilities from surface statistics of bilingual word alignments, as described by Koehn et al. (2003). We propose a new iteratively-trained phrase translation model tha...



Publication date: 2007